*Do-file for data preparation for the replication package for the paper "Precolonial Elites and Colonial Redistribution of Political Power"
*Description: This do-file produces the Working Dataset 1 "WD1_Egypt_MPs_1824_1923.dta" from the raw dataset "RD1_Egypt_MPs_1824_1923.dta":
*(1) It starts from "RD1_Egypt_MPs_1824_1923.dta".
*(2) It conducts data cleaning on "RD1_Egypt_MPs_1824_1923.dta" using "RD3_1882_1897.dta" and "RD4_1882_1907.dta" to produce an intermediate dataset "ID1_Egypt_MPs_1824_1923.dta". 
*(3) It merges "ID1_Egypt_MPs_1824_1923.dta" with province-level secondary dataset "RD2_Regressors_Province.dta" to produce "WD1_Egypt_MPs_1824_1923.dta".

version 17.0


***CHANGE WORKING DIRECTORY



*-------------------------------------------------------------------------------
*----------------------------Table of contents----------------------------------
*-------------------------------------------------------------------------------
  
/*

 I. PREPARATION OF RAW DATASET 1 (RD1_Egypt_MPs_1824_1923) TO PRODUCE INTERMEDIATE DATASET 1 (ID1_Egypt_MPs_1824_1923)

	  I.1. Reassignment of MP constituency based on MP history
	  I.2. Reassignment of MP occupational title based on MP history
	  I.3. Reassignment of MP honorific title based on MP history
	  I.4. Classification of MPs into Social Classes
	  I.5. New Entrant/Incumbent, Elected/Appointed, Lower/Upper House

II. MERGER OF INTERMEDIATE DATASET 1 (ID1_Egypt_MPs_1824_1923) WITH PROVINCE-LEVEL COVARIATES (RD2_Regressors_Province) TO PRODUCE WORKING DATASET 1 (WD1_Egypt_MPs_1824_1923)
    
*/



*---------------------------------------------------------------------------------------------------------------------------------
*---------------------------------------------------------------------------------------------------------------------------------
*-----I. PREPARATION OF RAW DATASET 1 (RD1_Egypt_MPs_1824_1923) TO PRODUCE INTERMEDIATE DATASET 1 (ID1_Egypt_MPs_1824_1923)-------
*---------------------------------------------------------------------------------------------------------------------------------
*---------------------------------------------------------------------------------------------------------------------------------


*-------------------------------------------------------------------------------
*-----------I.1. Reassignment of MP constituency based on MP history------------
*-------------------------------------------------------------------------------

use "RD1_Egypt_MPs_1824_1923.dta", clear

*Create identifiers for the level of geographic assignment of MP-session observations 
*Remark 1: We use the code of the closest population census (1882, 1897, 1907, or 1917) for geographic assignment in RD1
*Remark 2: We classify MP-session observations into four levels of geographic assignment: Missing, Province, District, and Village

gen Missing = CensusCode==""
gen Province = (substr(CensusCode,3,2)=="00" & Missing == 0)
gen District = (substr(CensusCode,5,2)=="00" & Missing == 0 & Province==0)
gen Village = Missing==0 & Province ==0 & District ==0

label variable Missing "=1 if MP's Original Geographic Assignment is Missing"
label variable Province "=1 if MP's Original Geographic Assignment Level is Province"
label variable District "=1 if MP's Original Geographic Assignment Level is District"
label variable Village "=1 if MP's Original Geographic Assignment Level is Village"

tab Village //339
tab District //77
tab Province //502
tab Missing //184
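
*Illustration (not part of the original data preparation): a minimal sketch of how the substr() rules above classify 6-character census codes.
*Assumption: codes follow the layout implied by the substr() calls in this do-file (characters 1-2 province, 3-4 district, 5-6 village).
*The toy codes and the toy_* variable names are hypothetical. The block runs on a temporary toy dataset and restores the data in memory afterwards.

preserve
clear
input str6 toy_code
"120000"
"120400"
"120407"
""
end
gen toy_missing  = toy_code == ""
gen toy_province = substr(toy_code,3,2) == "00" & !toy_missing
gen toy_district = substr(toy_code,5,2) == "00" & !toy_missing & !toy_province
gen toy_village  = !toy_missing & !toy_province & !toy_district
list, noobs
*Each toy observation falls into exactly one of the four levels
assert toy_missing + toy_province + toy_district + toy_village == 1
restore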


*Create Identifier for first observation for each MP

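sort MP_UID cycle
*Sorting on cycle inside the by-group makes the first observation explicitly the MP's earliest cycle
bysort MP_UID (cycle): gen MP_first = _n == 1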

 
*Create categorical variable to identify the original level of geographic assignment 

gen Geography = 4 if CensusCode==""
replace Geography = 3 if (CensusCode!="" & substr(CensusCode,3,2)=="00")
replace Geography = 2 if (CensusCode!="" & substr(CensusCode,3,2)!="00" & substr(CensusCode,5,2)=="00")
replace Geography = 1 if (CensusCode!="" & substr(CensusCode,3,2)!="00" & substr(CensusCode,5,2)!="00")

label define GeographyLabel 4 "Missing" 3 "Province" 2 "District" 1 "Village"
label values Geography GeographyLabel

bysort MP_UID: egen Geography_min = min(Geography)
bysort MP_UID: egen Geography_max = max(Geography)

label values Geography_min GeographyLabel
label values Geography_max GeographyLabel

label variable MP_first "MP's First Session in Parliament"
label variable Geography "MP's Original Level of Geographic Assignment"
label variable Geography_min "MP's Lowest Level of Geographic Assignment"
label variable Geography_max "MP's Highest Level of Geographic Assignment"

tab Geography_min Geography_max if MP_first
tab Geography_min Geography_max


*STEP A: Reassign MP location by assigning the earliest non-missing, most detailed location code of the MP_UID (in order of detail: village, then district, then province) to all other observations of that MP. This ensures a consistent location (census code) within MP_UID.

*Create STEP A geographic assignment variables (census code)

gen CensusCodeNew = CensusCode
gsort MP_UID Village District Province -cycle 
*The parentheses below only make the operator precedence explicit (& binds before |)
by MP_UID: replace CensusCodeNew = CensusCode[_N] if ///
	(!Village & !(District & District[_N]) & !(Province & Province[_N]) & !(Missing & Missing[_N])) ///
	| (Village & Village[_N])
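
*Illustration (not part of the original data preparation): a minimal sketch of the STEP A rule on one hypothetical MP.
*After the gsort, the last observation within the MP is the earliest observation at the most detailed level, so [_N] picks its code.
*All toy_* names and codes are made up for the example. The block restores the data in memory afterwards.

preserve
clear
input toy_id toy_cycle str6 toy_code
1 1 ""
1 2 "120000"
1 3 "120407"
1 4 "120409"
end
gen toy_missing  = toy_code == ""
gen toy_province = substr(toy_code,3,2) == "00" & !toy_missing
gen toy_district = substr(toy_code,5,2) == "00" & !toy_missing & !toy_province
gen toy_village  = !toy_missing & !toy_province & !toy_district
gen toy_new = toy_code
gsort toy_id toy_village toy_district toy_province -toy_cycle
by toy_id: replace toy_new = toy_code[_N] if ///
	(!toy_village & !(toy_district & toy_district[_N]) & !(toy_province & toy_province[_N]) & !(toy_missing & toy_missing[_N])) ///
	| (toy_village & toy_village[_N])
*All four sessions receive the earliest village code (cycle 3)
assert toy_new == "120407"
restore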


*Create identifiers for observations whose geographic assignment changed after STEP A

gen obsattributed1 = CensusCodeNew != CensusCode
gen obsattributed1_missing = CensusCode!=CensusCodeNew & CensusCode == ""
gen obsattributed1_detail = CensusCode!=CensusCodeNew & CensusCode != ""


*Create identifiers for MPs whose geographic assignment changed after STEP A

foreach x in obsattributed1 obsattributed1_missing obsattributed1_detail{
	bysort MP_UID: egen MP`x' = max(`x')
}


*Create identifiers for MPs that had a REAL switch in constituency after STEP A

gen switchprovince1 = substr(CensusCode,1,2) != substr(CensusCodeNew,1,2) & CensusCode!="" & CensusCodeNew!="" & MPobsattributed1
gen switchdistrict1 = substr(CensusCode,1,4) != substr(CensusCodeNew,1,4) & substr(CensusCode,3,2)!="00" & substr(CensusCodeNew,3,2)!="00" & CensusCode!="" & CensusCodeNew!="" & MPobsattributed1
gen switchvillage1 = CensusCode != CensusCodeNew & substr(CensusCode,5,2)!="00" & substr(CensusCodeNew,5,2)!="00" & CensusCode!="" & CensusCodeNew!="" & MPobsattributed1

foreach x in switchprovince1 switchdistrict1 switchvillage1{
	bysort MP_UID: egen MP`x' = max(`x')
}

drop switchprovince1 switchdistrict1 switchvillage1


*Create STEP A geographic assignment variables (1996 census code, census year)

gen CensusCode_1996New = CensusCode_1996
gsort MP_UID Village District Province -cycle 
by MP_UID: replace CensusCode_1996New = CensusCode_1996New[_N] if MPobsattributed1

gen CensusYearNew = CensusYear 
gsort MP_UID Village District Province -cycle 
by MP_UID: replace CensusYearNew = CensusYearNew[_N] if MPobsattributed1

 
*Update identifiers for the level of geographic assignment of MP-session observations 

replace Missing = CensusCodeNew==""
replace Province = (substr(CensusCodeNew,3,2)=="00" & Missing == 0)
replace District = (substr(CensusCodeNew,5,2)=="00" & Missing == 0 & Province==0)
replace Village = Missing==0 & Province ==0 & District ==0


*Creating categorical variable to identify the level of Geographic Assignment after STEP A

gen Geography1 = 4 if CensusCodeNew==""
replace Geography1 = 3 if (CensusCodeNew!="" & substr(CensusCodeNew,3,2)=="00")
replace Geography1 = 2 if (CensusCodeNew!="" & substr(CensusCodeNew,3,2)!="00" & substr(CensusCodeNew,5,2)=="00")
replace Geography1 = 1 if (CensusCodeNew!="" & substr(CensusCodeNew,3,2)!="00" & substr(CensusCodeNew,5,2)!="00")

label values Geography1 GeographyLabel

bysort MP_UID: egen Geography1_min = min(Geography1)
bysort MP_UID: egen Geography1_max = max(Geography1)

label values Geography1_min GeographyLabel
label values Geography1_max GeographyLabel

label variable CensusCodeNew "Geographic Code in Closest Population Census (STEP A)"
label variable CensusYearNew "Year of Closest Population Census (STEP A)"
label variable CensusCode_1996New "Geographic Code in 1996 Population Census (STEP A)"
label variable obsattributed1 "=1 if Observation's Geographic Assignment Changed After STEP A"
label variable obsattributed1_missing "=1 if Observation's Geographic Assignment Changed from Missing After STEP A"
label variable obsattributed1_detail "=1 if Observation's Geographic Assignment Became Less Aggregated After STEP A"
label variable MPswitchprovince1 "=1 if MP Switched Province After STEP A"
label variable MPswitchdistrict1 "=1 if MP Switched District After STEP A"
label variable MPswitchvillage1 "=1 if MP Switched Village After STEP A"
label variable MPobsattributed1 "=1 if MP's Geographic Assignment Changed At Least Once After STEP A"
label variable MPobsattributed1_missing "=1 if MP's Geographic Assignment Changed from Missing At Least Once After STEP A"
label variable MPobsattributed1_detail "=1 if MP's Geographic Assignment Became Less Aggregated At Least Once After STEP A"
label variable Geography1 "MP's Level of Geographic Assignment (STEP A)"
label variable Geography1_min "MP's Lowest Level of Geographic Assignment (STEP A)"
label variable Geography1_max "MP's Highest Level of Geographic Assignment (STEP A)"

tab Geography1 if MP_first
tab Geography1 
tab MPobsattributed1 if MP_first //52
tab MPobsattributed1 //164
tab MPobsattributed1_missing if MP_first //11
tab MPobsattributed1_missing //39
tab MPobsattributed1_detail if MP_first //41
tab MPobsattributed1_detail //125

//164 observations (52 unique MPs) for whom at least one observation within MP_UID changed location after STEP A. 125 observations (41 unique MPs) with non-missing location that were assigned a more detailed location based on other observations of the same MP. 39 observations (11 unique MPs) with missing location that were assigned any location based on other observations of the same MP.

sort MP_UID cycle

***There are 52 MPs (164 observations) whose geographic assignment changed after STEP A. In the next step, we determine if these MPs switched constituency or were rather assigned a more detailed constituency.

***There are 4 MPs (13 observations) who switched constituency after STEP A: 1 MP (5 obs) switched province, and 3 MPs (8 obs) switched both district and village within the same province. Inspecting these cases shows that the 1 province-switching MP arises because Rosetta changed province code between the 1882 and 1897 censuses, so it is NOT a switch according to the 1996 census code. The other three MPs are real switches, but since they are within the same province, they do not alter the cotton suitability assignment.

tab Village //401
tab District //81
tab Province //457
tab Missing //163


*STEP B: Reassign MP location by assigning the earliest non-missing location code of the MPD_ID (dynasty) to all MPs with missing location within the dynasty. We do this at the province level only: we assign the dynasty's province code to MPs in the same dynasty with missing location only if all MPs with non-missing location from the dynasty have the same province code in the 1996 census.

*Create identifier for dynasties (family names) whose MPs belong to different provinces
gsort MPD_ID Village District Province -cycle
by MPD_ID: gen diffprovince_ = substr(CensusCode_1996New,1,2)!=substr(CensusCode_1996New[_N],1,2) & !Missing
by MPD_ID: egen diffprovince = max(diffprovince_)
drop diffprovince_

*Create STEP B geographic assignment variables (census code, census year, and 1996 census code)

gsort MPD_ID Village District Province -cycle
by MPD_ID: gen CensusCodeNew2 = substr(CensusCodeNew[_N],1,2)+"0000" if (Missing==1) & (Province[_N]==1 | District[_N]==1 | Village[_N]==1) & MPD_ID!=. & !diffprovince
by MPD_ID: gen CensusCode_1996New2 = substr(CensusCode_1996New[_N],1,2)+"0000" if (Missing==1) & (Province[_N]==1 | District[_N]==1 | Village[_N]==1) & MPD_ID!=. & !diffprovince
by MPD_ID: gen CensusYearNew2 = CensusYearNew[_N] if (Missing==1) & (Province[_N]==1 | District[_N]==1 | Village[_N]==1) & MPD_ID!=. & !diffprovince
replace CensusCodeNew2 = CensusCodeNew if CensusCodeNew2==""
replace CensusCode_1996New2 = CensusCode_1996New if CensusCode_1996New2==""
replace CensusYearNew2 = CensusYearNew if CensusYearNew2==.
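
*Illustration (not part of the original data preparation): a minimal sketch of the STEP B idea on two hypothetical dynasties.
*Dynasty 1 has a single known province, so its member with missing location inherits the province-level code.
*Dynasty 2 has members in two provinces, so its member with missing location stays missing.
*For brevity the sketch uses a single code variable instead of separate contemporaneous and 1996 codes. All names and codes are made up.

preserve
clear
input toy_dynasty str6 toy_code toy_cycle
1 "120400" 1
1 ""       2
2 "120000" 1
2 "130000" 2
2 ""       3
end
gen toy_missing  = toy_code == ""
gen toy_province = substr(toy_code,3,2) == "00" & !toy_missing
gen toy_district = substr(toy_code,5,2) == "00" & !toy_missing & !toy_province
gen toy_village  = !toy_missing & !toy_province & !toy_district
gsort toy_dynasty toy_village toy_district toy_province -toy_cycle
by toy_dynasty: gen toy_diff_ = substr(toy_code,1,2) != substr(toy_code[_N],1,2) & !toy_missing
by toy_dynasty: egen toy_diff = max(toy_diff_)
gsort toy_dynasty toy_village toy_district toy_province -toy_cycle
by toy_dynasty: gen toy_code2 = substr(toy_code[_N],1,2) + "0000" if toy_missing & !toy_missing[_N] & !toy_diff
replace toy_code2 = toy_code if toy_code2 == ""
assert toy_code2 == "120000" if toy_dynasty == 1 & toy_missing
assert toy_code2 == "" if toy_dynasty == 2 & toy_missing
restore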


*Create identifiers for observations whose geographic assignment changed between STEP A and STEP B

gen obsattributed2 = CensusCodeNew2 != CensusCodeNew
gen obsattributed2_missing = CensusCodeNew2!=CensusCodeNew & CensusCodeNew == ""
gen obsattributed2_detail = CensusCodeNew2!=CensusCodeNew & CensusCodeNew != ""


*Create identifiers for MPs whose geographic assignment changed between STEP A and STEP B

foreach x in obsattributed2 obsattributed2_missing obsattributed2_detail{
		bysort MP_UID: egen MP`x' = max(`x')
}


*Create identifiers for MPs that had a REAL switch in constituency after STEP B

gen switchprovince2 = substr(CensusCodeNew,1,2) != substr(CensusCodeNew2,1,2) & CensusCodeNew!="" & CensusCodeNew2!="" & MPobsattributed2
gen switchdistrict2 = substr(CensusCodeNew,1,4) != substr(CensusCodeNew2,1,4) & substr(CensusCodeNew,3,2)!="00" & substr(CensusCodeNew2,3,2)!="00" & CensusCodeNew!="" & CensusCodeNew2!="" & MPobsattributed2
gen switchvillage2 = CensusCodeNew != CensusCodeNew2 & substr(CensusCodeNew,5,2)!="00" & substr(CensusCodeNew2,5,2)!="00" & CensusCodeNew!="" & CensusCodeNew2!="" & MPobsattributed2

foreach x in switchprovince2 switchdistrict2 switchvillage2{
		bysort MP_UID: egen MP`x' = max(`x')
}

drop switchprovince2 switchdistrict2 switchvillage2


*Update identifiers for the level of geographic assignment of MP-session observations 

replace Missing = CensusCodeNew2==""
replace Province = (substr(CensusCodeNew2,3,2)=="00" & Missing == 0)
replace District = (substr(CensusCodeNew2,5,2)=="00" & Missing == 0 & Province==0)
replace Village = Missing ==0 & Province ==0 & District ==0


*Creating categorical variable to identify the level of Geographic Assignment after STEP B

gen Geography2 = 4 if CensusCodeNew2==""
replace Geography2 = 3 if (CensusCodeNew2!="" & substr(CensusCodeNew2,3,2)=="00")
replace Geography2 = 2 if (CensusCodeNew2!="" & substr(CensusCodeNew2,3,2)!="00" & substr(CensusCodeNew2,5,2)=="00")
replace Geography2 = 1 if (CensusCodeNew2!="" & substr(CensusCodeNew2,3,2)!="00" & substr(CensusCodeNew2,5,2)!="00")

label values Geography2 GeographyLabel

bysort MP_UID: egen Geography2_min = min(Geography2)
bysort MP_UID: egen Geography2_max = max(Geography2)

label values Geography2_min GeographyLabel
label values Geography2_max GeographyLabel

label variable CensusCodeNew2 "Geographic Code in Closest Population Census (STEP B)"
label variable CensusYearNew2 "Year of Closest Population Census (STEP B)"
label variable CensusCode_1996New2 "Geographic Code in 1996 Population Census (STEP B)"
label variable obsattributed2 "=1 if Observation's Geographic Assignment Changed After STEP B"
label variable obsattributed2_missing "=1 if Observation's Geographic Assignment Changed from Missing After STEP B"
label variable obsattributed2_detail "=1 if Observation's Geographic Assignment Became Less Aggregated After STEP B"
label variable MPswitchprovince2 "=1 if MP Switched Province After STEP B"
label variable MPswitchdistrict2 "=1 if MP Switched District After STEP B"
label variable MPswitchvillage2 "=1 if MP Switched Village After STEP B"
label variable MPobsattributed2 "=1 if MP's Geographic Assignment Changed At Least Once After STEP B"
label variable MPobsattributed2_missing "=1 if MP's Geographic Assignment Changed from Missing At Least Once After STEP B"
label variable MPobsattributed2_detail "=1 if MP's Geographic Assignment Became Less Aggregated At Least Once After STEP B"
label variable Geography2 "MP's Level of Geographic Assignment (STEP B)"
label variable Geography2_min "MP's Lowest Level of Geographic Assignment (STEP B)"
label variable Geography2_max "MP's Highest Level of Geographic Assignment (STEP B)"

tab Geography2 if MP_first
tab Geography2 
tab MPobsattributed2 if MP_first //15
tab MPobsattributed2 //27
tab MPobsattributed2_missing if MP_first //15
tab MPobsattributed2_missing //27
tab MPobsattributed2_detail if MP_first //0
tab MPobsattributed2_detail //0

//27 observations (15 unique MPs) for whom at least one observation within MP_UID changed location after STEP B. All these observations had missing location and were assigned a province based on other MPs in the same dynasty.


sort MPD_ID MP_UID cycle

***There are 15 MPs (27 observations) whose geographic assignment changed after STEP B. In the next step, we determine if these MPs switched constituency or were rather assigned a more detailed constituency.

***There are 0 MPs (0 observations) who switched constituency after STEP B. This is because STEP B assigns province code to MPs with missing location.


tab Village //401
tab District //81
tab Province //484 (up from 457)
tab Missing //136 (down from 163)



*Assign the 1882 census codes following the imputation in STEP A and STEP B

*** VILLAGE: For MPs localized at the village level, we identify the population census used in the coding (1882, 1897, 1907, or 1917):

tab CensusYearNew2 if Village
tab CensusCodeNew2 if Village
*Remark 1: There are 401 observations localized at the village level. The village-level localized observations are all localized using the 1882 census codes. So, the 1882 census codes are the same as CensusCodeNew2 for these observations.
*Remark 2: The village-level localized observations are all located in rural provinces in 1882.

gen code_1882 = CensusCodeNew2 if CensusYearNew2==1882 & Village == 1 //401

*** DISTRICT: For MPs localized at the district level, we identify the population census used in the coding (1882, 1897, 1907, or 1917):

tab CensusYearNew2 if District
tab CensusCodeNew2 if District & CensusYearNew2==1882 
tab CensusCodeNew2 if District & CensusYearNew2==1897 
tab CensusCodeNew2 if District & CensusYearNew2==1907 

*Remark 3: There are 81 observations localized at the district level. The district-level localized observations are: 65 localized using the 1882 census codes, 14 using the 1897 census codes, 2 using the 1907 census codes.
*Remark 4: The district-level localized observations in 1882 are all located in rural provinces. However, the district-level localized observations in 1897 contain 5 observations that map to Rosetta, an urban province in 1882. The district-level localized observations in 1907 contain 1 observation that maps to Rosetta, an urban province in 1882, and 1 observation that maps to Damietta, an urban province in 1882.

gen Code_1897 = CensusCodeNew2 if CensusYearNew2==1897 & District==1 // 14 observations
gen Code_1907 = CensusCodeNew2 if CensusYearNew2==1907 & District==1 // 2 observations


*We assign the 1882 district census codes to observations localized at the district level using the 1897 and 1907 codes:
merge m:1 Code_1897 using RD3_1882_1897, keepusing(Code_1897 districtcode_mod1) keep(match master match_update) nogen //14 matched
merge m:1 Code_1907 using RD4_1882_1907, keepusing(Code_1907 districtcode_mod1) keep(match master match_update) nogen update replace //2 matched

gen districtcode_1882 = substr(CensusCodeNew2,1,4) if CensusYearNew2==1882 & (Village==1 | District==1) //466: 401 Village + 65 District
replace districtcode_1882 = string(districtcode_mod1) if CensusYearNew2!=1882 & (Village==1 | District==1)
replace districtcode_1882 = "0500" if CensusCodeNew2 == "180600" & (CensusYearNew2 == 1897 | CensusYearNew2 == 1907)
replace districtcode_1882 = "1101" if CensusCodeNew2 == "120400" & CensusYearNew2 == 1907
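
*Illustration (not part of the original data preparation): string() without a format drops leading zeros from numeric codes, while a zero-padded format keeps a fixed width of four characters.
*Whether districtcode_mod1 actually contains codes below 1000 is not known from this do-file, so this is only a sketch of the distinction.
*The toy values and names are hypothetical.

preserve
clear
input toy_district
501
1101
end
gen str4 toy_plain  = string(toy_district)
gen str4 toy_padded = string(toy_district, "%04.0f")
assert toy_plain[1]  == "501"  & toy_padded[1] == "0501"
assert toy_plain[2]  == "1101" & toy_padded[2] == "1101"
restore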

drop Code_1897 Code_1907
*Remark 5: There are 7 observations that are matched to urban provinces in 1882: 1 observation with code 1204 in 1907 matched to 11 (Damietta) in 1882, and 6 observations with code "1806" in 1897 and 1907 matched to 5 (Rosetta) in 1882.


*** PROVINCE: For MPs localized at the province level, we identify the population census used in the coding (1882, 1897, 1907, or 1917):

tab CensusYearNew2 if Province
tab CensusCodeNew2 if Province
tab CensusCodeNew2 if Province & CensusYearNew2==1882 
tab CensusCodeNew2 if Province & CensusYearNew2==1897 
tab CensusCodeNew2 if Province & CensusYearNew2==1907 
tab CensusCodeNew2 if Province & CensusYearNew2==1917 

*Remark 6: There are 484 observations localized at the province level: 138 localized using the 1882 census codes, 193 localized using the 1897 census codes, 79 localized using the 1907 census codes, 74 localized using the 1917 census codes.
*Remark 7: The province-level localized observations are: 374 located in rural provinces and 110 located in urban provinces.
*Remark 8: Because province codes in 1882, 1897, 1907, 1917 censuses are the same, we use the contemporaneous province codes directly.

gen provincecode_1882 = substr(CensusCodeNew2,1,2) if Village==1 | District==1 | Province==1 //966: 401 Village + 81 District + 484 Province
replace provincecode_1882 = "11" if CensusCodeNew2 == "120400" & CensusYearNew2 == 1907 //1
replace provincecode_1882 = "05" if CensusCodeNew2 == "180600" & (CensusYearNew2 == 1897 | CensusYearNew2 == 1907) //6


replace code_1882 = districtcode_1882+"00" if District == 1 //81
replace code_1882 = provincecode_1882+"0000" if Province == 1 //484

replace districtcode_1882 = substr(code_1882,1,4) if Village == 1 //0
replace districtcode_1882 = provincecode_1882+"00" if Province == 1 //484

replace provincecode_1882 = substr(code_1882,1,2) if Village == 1 | District == 1 //0


*Urban/Rural Status: This uses the 1882 census boundaries.

*Define urban/rural status of MPs according to the 1882 census administrative division
gen urban = inlist(provincecode_1882, "01", "02", "04", "05", "06", "07", "11")
replace urban = . if Missing == 1

destring provincecode_1882 districtcode_1882, replace

label variable code_1882 "Village Code in 1882 Population Census"
label variable districtcode_1882 "District Code in 1882 Population Census"
label variable provincecode_1882 "Province Code in 1882 Population Census"


*-------------------------------------------------------------------------------
*-------I.2. Reassignment of MP occupational title based on MP history----------
*-------------------------------------------------------------------------------


*Modify occupation code by assigning MP's initial occupation code that appears in MP's first parliament with non-missing occupation to the MP's other (earlier and later) terms in parliament

gen MP_Occupation_Code_NM = MP_Occupation_Code != .

gen MP_Occupation_Code_Mod = MP_Occupation_Code
gsort MP_UID MP_Occupation_Code_NM -cycle 
by MP_UID: replace MP_Occupation_Code_Mod=MP_Occupation_Code[_N] //114 changes
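
*Illustration (not part of the original data preparation): a minimal sketch of the rule above on one hypothetical MP.
*Sorting non-missing codes last and cycles in descending order puts the earliest non-missing occupation in the last position, which [_N] then copies to every term.
*The toy_* names are made up. The codes 58410 and 61120 are the Village Headman and Notable codes used in this do-file.

preserve
clear
input toy_id toy_cycle toy_occ
1 1 .
1 2 58410
1 3 61120
end
gen toy_nm = toy_occ != .
gen toy_occ_mod = toy_occ
gsort toy_id toy_nm -toy_cycle
by toy_id: replace toy_occ_mod = toy_occ[_N]
*All three terms receive the code of the first term with a non-missing occupation (cycle 2)
assert toy_occ_mod == 58410
restore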

drop MP_Occupation_Code_NM

*Creating Identifiers for MPs who Changed Occupational Title

sort MP_UID cycle
gen occupationshift = MP_Occupation_Code != MP_Occupation_Code_Mod
gen occupationshift_missing = (MP_Occupation_Code != MP_Occupation_Code_Mod) & MP_Occupation_Code_Mod != . & MP_Occupation_Code == .
gen occupationshift_real = (MP_Occupation_Code != MP_Occupation_Code_Mod) & MP_Occupation_Code_Mod != . & MP_Occupation_Code != .

bysort MP_UID: egen MPoccupationshift = max(occupationshift)
bysort MP_UID: egen MPoccupationshift_missing = max(occupationshift_missing)
bysort MP_UID: egen MPoccupationshift_real = max(occupationshift_real)

gen MPoccupationshift_missingonly = MPoccupationshift_missing & !MPoccupationshift_real

tab1 occupationshift occupationshift_missing occupationshift_real
tab1 MPoccupationshift MPoccupationshift_missing MPoccupationshift_missingonly MPoccupationshift_real if MP_first
tab MP_Occupation_Code_Mod MP_Occupation_Code if occupationshift_real

***There are 70 MPs (114 observations) where the MP's "initial" occupational code in their first term in parliament is different from their current occupational code. Out of these, 53 MPs (77 observations) have a missing current occupational code, and were assigned a non-missing initial occupational code. The remaining 25 MPs (37 observations) witnessed at least one REAL shift, where the current occupational code is not missing, and is different from the initial occupational code. These 37 observations are distributed as follows:
*32 observations shifted from "58410" (Village Headman) to "61120" (Notable)
*1 shifted from "58410" (Village Headman) to "20210" (Government Administrator)
*3 shifted from "41025" (Business) to "61120" (Notable)
*1 shifted from "20210" (Government Administrator) to "61120" (Notable)

	 
*Generate dummy variables for occupational groups based on current occupational code 
	 
gen professional= MP_Occupation_Code==-1 | MP_Occupation_Code==-10 ///
  | MP_Occupation_Code==01340 | MP_Occupation_Code==02740 | MP_Occupation_Code==02000 ///
  | MP_Occupation_Code==02930 | MP_Occupation_Code==04215 | MP_Occupation_Code==05190 ///
  | MP_Occupation_Code==05280 | MP_Occupation_Code==05290 | MP_Occupation_Code==06100 ///
  | MP_Occupation_Code==06310 | MP_Occupation_Code==06510 | MP_Occupation_Code==06710 ///
  | MP_Occupation_Code==08420 | MP_Occupation_Code==09010 | MP_Occupation_Code==11010 ///
  | MP_Occupation_Code==11020 | MP_Occupation_Code==12110 | MP_Occupation_Code==12210 ///
  | MP_Occupation_Code==13100 | MP_Occupation_Code==13240 | MP_Occupation_Code==13320 ///
  | MP_Occupation_Code==13920 | MP_Occupation_Code==13940 | MP_Occupation_Code==13950 /// 
  | MP_Occupation_Code==13990 | MP_Occupation_Code==15900 | MP_Occupation_Code==15915 ///
  | MP_Occupation_Code==15920 | MP_Occupation_Code==15955 | MP_Occupation_Code==16160 ///
  | MP_Occupation_Code==16350 | MP_Occupation_Code==16350 | MP_Occupation_Code==17920 ///
  | MP_Occupation_Code==18050 | MP_Occupation_Code==19120 | MP_Occupation_Code==19270 ///
  | MP_Occupation_Code==19290 | MP_Occupation_Code==19310 | MP_Occupation_Code==19350 ///
  | MP_Occupation_Code==19390 | MP_Occupation_Code==19940 | MP_Occupation_Code==21940 ///
  | MP_Occupation_Code==21950 | MP_Occupation_Code==21990 | MP_Occupation_Code==22610 ///
  | MP_Occupation_Code==33140 | MP_Occupation_Code==33940 | MP_Occupation_Code==44220 /// 
  | MP_Occupation_Code==49090 | MP_Occupation_Code==05400 | MP_Occupation_Code==74500 /// 
  | MP_Occupation_Code==75600 | MP_Occupation_Code==77310 | MP_Occupation_Code==85510 ///
  | MP_Occupation_Code==85560 | MP_Occupation_Code==94980 | MP_Occupation_Code==98590 ///
  | MP_Occupation_Code==99900
   replace professional=. if MP_Occupation_Code==. 
  
     /* previous employee, law student, weather scientist, specialist, sailor, 
	 agriculture specialist, livestock specialist, engineer, dentist, veterinarian, 
	 pharmacist, doctor, IT specialist, economist, accountant, auditor, lawyer, 
	 judges, university teacher, school teacher, school manager, PR manager, 
	 antique restoration, TV producer, TV presenter, radio presenter, editor, 
	 journalist, newspaper owner, sports trainer, librarian, political scientist,
	 researcher, social worker, advertising manager, HR manager, financial manager, 
	 project manager, inspector, banker, financial adviser, secretariat, 
	 sales representative, fani, petrochemicals worker, textile worker, 
	 butcher, electrician, maintenance worker, quality inspector, driver, worker */

	 /* 1824-1923: Judge (8), Journal Editor (1) */
	 
gen businessman=  MP_Occupation_Code==03390 | MP_Occupation_Code==21240 ///
  | MP_Occupation_Code==04 | MP_Occupation_Code==12510 | MP_Occupation_Code==21110 /// 
  | MP_Occupation_Code==21300 | MP_Occupation_Code==41020 | MP_Occupation_Code==41025 
     replace businessman=. if MP_Occupation_Code==. 

     /* contractor, businessman, wakeel of company, bank manager, sales manager, merchant*/

	 /* 1824-1923: Merchant (5) */
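
*Illustration (optional check, not part of the original data preparation): the chain of equality tests above can be written more compactly with inlist().
*This sketch rebuilds the businessman dummy that way and confirms it matches. The name businessman_chk is hypothetical and assumed not to exist in the data.
*Numeric literals with leading zeros (for example 03390) are read as ordinary numbers, so they are written without the zeros here.

gen byte businessman_chk = inlist(MP_Occupation_Code, 3390, 21240, 4, 12510, 21110, 21300, 41020, 41025) if MP_Occupation_Code != .
assert businessman_chk == businessman
drop businessman_chk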

gen religious_elite=  MP_Occupation_Code==14120 | MP_Occupation_Code==14140 ///
  | MP_Occupation_Code==14190 | MP_Occupation_Code==14990
     replace religious_elite=. if MP_Occupation_Code==. 

	/* Azhar teacher, mofti/azhar scientist, wakeel awkaf, other religious workers*/

	/* 1824-1923: Minister of Religion (4): Mufti, Naqib al-Ashraf, 'Alim */

gen top_bureaucrat= MP_Occupation_Code==20110
     replace top_bureaucrat=. if MP_Occupation_Code==. 

	/* 1824-1923: Legislative Official (6)*/

gen bureaucrat= MP_Occupation_Code==20210 | MP_Occupation_Code==21000 | MP_Occupation_Code==31000 | MP_Occupation_Code==31010 ///
  | MP_Occupation_Code==31030 | MP_Occupation_Code==31040 | MP_Occupation_Code==31090
     replace bureaucrat=. if MP_Occupation_Code==. 

	/* 1824-1923: Government Administrator (37), Manager (10), Government Executive Official (2)*/

gen village_headman= MP_Occupation_Code==58410 
   replace village_headman=. if MP_Occupation_Code==. 

     /* Sheikh Eklim/Omda*/
	 /* 1824-1923: Village headman (348) */

gen ayan= MP_Occupation_Code==61120 | MP_Occupation_Code==61110
   replace ayan=. if MP_Occupation_Code==. 

     /* Ayan: notable*/
	 /* 1824-1923: Notable (303) */

gen missing= MP_Occupation_Code==. 

 	 
*Generate dummy variables for occupational groups based on initial occupational code
	 
gen professional_mod= MP_Occupation_Code_Mod==-1 | MP_Occupation_Code_Mod==-10 ///
  | MP_Occupation_Code_Mod==01340 | MP_Occupation_Code_Mod==02740 | MP_Occupation_Code_Mod==02000 ///
  | MP_Occupation_Code_Mod==02930 | MP_Occupation_Code_Mod==04215 | MP_Occupation_Code_Mod==05190 ///
  | MP_Occupation_Code_Mod==05280 | MP_Occupation_Code_Mod==05290 | MP_Occupation_Code_Mod==06100 ///
  | MP_Occupation_Code_Mod==06310 | MP_Occupation_Code_Mod==06510 | MP_Occupation_Code_Mod==06710 ///
  | MP_Occupation_Code_Mod==08420 | MP_Occupation_Code_Mod==09010 | MP_Occupation_Code_Mod==11010 ///
  | MP_Occupation_Code_Mod==11020 | MP_Occupation_Code_Mod==12110 | MP_Occupation_Code_Mod==12210 ///
  | MP_Occupation_Code_Mod==13100 | MP_Occupation_Code_Mod==13240 | MP_Occupation_Code_Mod==13320 ///
  | MP_Occupation_Code_Mod==13920 | MP_Occupation_Code_Mod==13940 | MP_Occupation_Code_Mod==13950 /// 
  | MP_Occupation_Code_Mod==13990 | MP_Occupation_Code_Mod==15900 | MP_Occupation_Code_Mod==15915 ///
  | MP_Occupation_Code_Mod==15920 | MP_Occupation_Code_Mod==15955 | MP_Occupation_Code_Mod==16160 ///
  | MP_Occupation_Code_Mod==16350 | MP_Occupation_Code_Mod==16350 | MP_Occupation_Code_Mod==17920 ///
  | MP_Occupation_Code_Mod==18050 | MP_Occupation_Code_Mod==19120 | MP_Occupation_Code_Mod==19270 ///
  | MP_Occupation_Code_Mod==19290 | MP_Occupation_Code_Mod==19310 | MP_Occupation_Code_Mod==19350 ///
  | MP_Occupation_Code_Mod==19390 | MP_Occupation_Code_Mod==19940 | MP_Occupation_Code_Mod==21940 ///
  | MP_Occupation_Code_Mod==21950 | MP_Occupation_Code_Mod==21990 | MP_Occupation_Code_Mod==22610 ///
  | MP_Occupation_Code_Mod==33140 | MP_Occupation_Code_Mod==33940 | MP_Occupation_Code_Mod==44220 /// 
  | MP_Occupation_Code_Mod==49090 | MP_Occupation_Code_Mod==05400 | MP_Occupation_Code_Mod==74500 /// 
  | MP_Occupation_Code_Mod==75600 | MP_Occupation_Code_Mod==77310 | MP_Occupation_Code_Mod==85510 ///
  | MP_Occupation_Code_Mod==85560 | MP_Occupation_Code_Mod==94980 | MP_Occupation_Code_Mod==98590 ///
  | MP_Occupation_Code_Mod==99900
   replace professional_mod=. if MP_Occupation_Code_Mod==. 
  
	 /* 1824-1923: Judge (8), Journal Editor (1) */

gen businessman_mod=  MP_Occupation_Code_Mod==03390 | MP_Occupation_Code_Mod==21240 ///
  | MP_Occupation_Code_Mod==04 | MP_Occupation_Code_Mod==12510 | MP_Occupation_Code_Mod==21110 /// 
  | MP_Occupation_Code_Mod==21300 | MP_Occupation_Code_Mod==41020 | MP_Occupation_Code_Mod==41025 
     replace businessman_mod=. if MP_Occupation_Code_Mod==. 
	
    /* 1824-1923: Merchant (11) */


gen religious_elite_mod=  MP_Occupation_Code_Mod==14120 | MP_Occupation_Code_Mod==14140 ///
  | MP_Occupation_Code_Mod==14190 | MP_Occupation_Code_Mod==14990
     replace religious_elite_mod=. if MP_Occupation_Code_Mod==. 

	/* 1824-1923: Minister of Religion (4): Mufti, Naqib al-Ashraf, 'Alim */


gen top_bureaucrat_mod= MP_Occupation_Code_Mod==20110
     replace top_bureaucrat_mod=. if MP_Occupation_Code_Mod==. 

	/* 1824-1923: Legislative Official (6)*/

gen bureaucrat_mod= MP_Occupation_Code_Mod==20210 | MP_Occupation_Code_Mod==21000 ///
  | MP_Occupation_Code_Mod==31000 | MP_Occupation_Code_Mod==31010 | MP_Occupation_Code_Mod==31030 ///
  | MP_Occupation_Code_Mod==31040 | MP_Occupation_Code_Mod==31090
     replace bureaucrat_mod=. if MP_Occupation_Code_Mod==. 

	/* 1824-1923: Government Administrator (37), Manager (10), Government Executive Official (3)*/

gen village_headman_mod= MP_Occupation_Code_Mod==58410 
   replace village_headman_mod=. if MP_Occupation_Code_Mod==. 

	 /* 1824-1923: Village headman (409) */

gen ayan_mod= MP_Occupation_Code_Mod==61120 | MP_Occupation_Code_Mod==61110
   replace ayan_mod=. if MP_Occupation_Code_Mod==. 

	 /* 1824-1923: Notable (312) */
	 
gen missing_mod= MP_Occupation_Code_Mod==. 
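
*Optional sanity check (a sketch, not part of the original procedure): assuming the occupational groups above are meant to be mutually exclusive, each observation with a non-missing initial occupation code should be assigned to exactly one group. The tabulation reports the number of groups per observation; a value of 0 would indicate an unclassified HISCO code and a value above 1 a double-classified code.
tempvar n_occgroups
egen `n_occgroups' = rowtotal(professional_mod businessman_mod religious_elite_mod top_bureaucrat_mod bureaucrat_mod village_headman_mod ayan_mod)
tab `n_occgroups' if MP_Occupation_Code_Mod != .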


label variable MP_Occupation_Code_Mod "MP Initial Occupation Code (HISCO)"
label variable occupationshift "= 1 if MP Occupation Different from Initial Occupation"
label variable occupationshift_missing "= 1 if MP Occupation Missing and Initial Occupation Non-Missing"
label variable occupationshift_real "= 1 if MP Occupation Non-Missing and Different from Initial Occupation"
label variable MPoccupationshift "= 1 if MP Had At Least One Change in Occupation"
label variable MPoccupationshift_missing "= 1 if MP Had At Least One Occupation Missing and Initial Occupation Non-Missing"
label variable MPoccupationshift_real "= 1 if MP Had At Least One Real Occupation Change"
label variable MPoccupationshift_missingonly "= 1 if MP Occupation Changes Are All from Missing"
label variable top_bureaucrat "= 1 if MP-Session is Top Bureaucrat"
label variable professional "= 1 if MP-Session is Professional"
label variable businessman "= 1 if MP-Session is Business"
label variable religious_elite "= 1 if MP-Session is Religious Elite"
label variable bureaucrat "= 1 if MP-Session is Bureaucrat"
label variable village_headman "= 1 if MP-Session is Village Headman"
label variable ayan "= 1 if MP-Session is Notable"
label variable missing "= 1 if MP-Session is Missing Occupation"
label variable top_bureaucrat_mod "= 1 if MP Initial Occupation is Top Bureaucrat"
label variable professional_mod "= 1 if MP Initial Occupation is Professional"
label variable businessman_mod "= 1 if MP Initial Occupation is Business"
label variable religious_elite_mod "= 1 if MP Initial Occupation is Religious Elite"
label variable bureaucrat_mod "= 1 if MP Initial Occupation is Bureaucrat"
label variable village_headman_mod "= 1 if MP Initial Occupation is Village Headman"
label variable ayan_mod "= 1 if MP Initial Occupation is Notable"
label variable missing_mod "= 1 if MP Occupation is All Missing"



*-------------------------------------------------------------------------------
*--------I.3. Reassignment of MP honorific title based on MP history------------
*-------------------------------------------------------------------------------


*Modify honorific title by assigning MP's honorific title that appears in MP's first parliament with non-missing honorific title to all the MP's other terms in parliament

gen Full_Title_NM = Full_Title != ""

gen Full_Title_mod = Full_Title
gsort MP_UID Full_Title_NM -cycle 
by MP_UID: replace Full_Title_mod=Full_Title[_N]  //157 changes

drop Full_Title_NM
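
*Optional consistency check (a sketch, not part of the original procedure): after the reassignment above, Full_Title_mod should take a single value within each MP_UID. Note that bysort re-sorts the data by MP_UID and Full_Title_mod; the commands that follow do not depend on the sort order.
bysort MP_UID (Full_Title_mod): assert Full_Title_mod[1] == Full_Title_mod[_N]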

*Creating Identifiers for MPs who Changed Honorific Title

gen titleshift = Full_Title != Full_Title_mod
gen titleshift_missing = (Full_Title != Full_Title_mod) & Full_Title_mod != "" & Full_Title == ""
gen titleshift_real = (Full_Title != Full_Title_mod) & Full_Title_mod != "" & Full_Title != ""

bysort MP_UID: egen MPtitleshift = max(titleshift)
bysort MP_UID: egen MPtitleshift_missing = max(titleshift_missing)
bysort MP_UID: egen MPtitleshift_real = max(titleshift_real)

gen MPtitleshift_missingonly = MPtitleshift_missing & !MPtitleshift_real

tab1 titleshift titleshift_missing titleshift_real
tab1 MPtitleshift MPtitleshift_missing MPtitleshift_missingonly MPtitleshift_real if MP_first

tab Full_Title_mod Full_Title if titleshift_real

***There are 93 MPs (157 observations) whose initial honorific title (the title in their first parliament with a non-missing honorific title) differs from their current honorific title. Of these, 7 MPs (9 observations) have a missing current honorific title and were assigned a non-missing initial honorific title. The remaining 86 MPs (148 observations) witnessed at least one REAL shift, where the current honorific title is non-missing and differs from the initial honorific title. These 148 observations are distributed as follows (initial title ---> current title): 
*Effendi (45) ---> Effendi (Bek) (2), Sheikh (Bek) (1), Pasha (1), Bek (40), Bek (Pasha) (1)
*Effendi (Bek) (8) ---> Bek (8)
*Hajj (2) ---> Sheikh (1), Bek (1)
*Hajj Ra'ees (5) ---> Effendi (1), Effendi (Bek) (1), Sheikh (2), Bek (1)
*Sheikh (57) ---> Effendi (24), Effendi (Bek) (1), Pasha (1), Bek (27), Bek (Pasha) (4)
*Sheikh Hajj (1) ---> Effendi (1)
*Mo'allem (3) ---> Effendi (3)
*Pasha (1) ---> Doctor (Pasha) (1)
*Bek (22) ---> Pasha (17), Bek (Pasha) (5)
*Bek (Pasha) (4) ---> Pasha (4)



*Generate dummy variables for honorific title groups based on current honorific title

gen mp_sheikh = Full_Title == "الشيخ" | Full_Title == "الشيخ (بك)"
replace mp_sheikh = . if Full_Title == ""

gen mp_effendi = Full_Title == "أفندي" | Full_Title == "أفندي (بك)" | Full_Title == "الأنبا أفندي"
replace mp_effendi = . if Full_Title == ""

gen mp_bek = Full_Title == "بك" | Full_Title == "بك (باشا)"
replace mp_bek = . if Full_Title == ""

gen mp_pasha = Full_Title == "باشا" | Full_Title == "الأمير باشا"
replace mp_pasha = . if Full_Title == ""

gen mp_othertitle = mp_sheikh == 0 & mp_effendi == 0 & mp_bek == 0 & mp_pasha == 0
replace mp_othertitle = . if Full_Title == ""

gen mp_missingtitle = Full_Title == ""



*Generate dummy variables for different honorific title groups based on initial honorific title

gen mp_sheikh_mod = Full_Title_mod == "الشيخ" | Full_Title_mod == "الشيخ (بك)"
replace mp_sheikh_mod = . if Full_Title_mod == ""

gen mp_effendi_mod = Full_Title_mod == "أفندي" | Full_Title_mod == "أفندي (بك)" | Full_Title_mod == "الأنبا أفندي"
replace mp_effendi_mod = . if Full_Title_mod == ""

gen mp_bek_mod = Full_Title_mod == "بك" | Full_Title_mod == "بك (باشا)"
replace mp_bek_mod = . if Full_Title_mod == ""

gen mp_pasha_mod = Full_Title_mod == "باشا" | Full_Title_mod == "الأمير باشا"
replace mp_pasha_mod = . if Full_Title_mod == ""

gen mp_othertitle_mod = mp_sheikh_mod == 0 & mp_effendi_mod == 0 & mp_bek_mod == 0 & mp_pasha_mod == 0
replace mp_othertitle_mod = . if Full_Title_mod == ""

gen mp_missingtitle_mod = Full_Title_mod == ""
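
*Optional consistency check (a sketch, not part of the original procedure): given the definitions above, every observation with a non-missing initial honorific title should belong to exactly one title group.
assert mp_sheikh_mod + mp_effendi_mod + mp_bek_mod + mp_pasha_mod + mp_othertitle_mod == 1 if Full_Title_mod != ""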


label variable Full_Title_mod "MP Initial Honorific Title (Arabic)"
label variable titleshift "= 1 if MP Title Different from Initial Title"
label variable titleshift_missing "= 1 if MP Title Missing and Initial Title Non-Missing"
label variable titleshift_real "= 1 if MP Title Non-Missing and Different from Initial Title"
label variable MPtitleshift "= 1 if MP Had At Least One Change in Title"
label variable MPtitleshift_missing "= 1 if MP Had At Least One Title Missing and Initial Title Non-Missing"
label variable MPtitleshift_real "= 1 if MP Had At Least One Real Title Change"
label variable MPtitleshift_missingonly "= 1 if MP Title Changes Are All from Missing"
label variable mp_sheikh "= 1 if MP-Session is Sheikh"
label variable mp_effendi "= 1 if MP-Session is Effendi"
label variable mp_bek "= 1 if MP-Session is Bey"
label variable mp_pasha "= 1 if MP-Session is Pasha"
label variable mp_othertitle "= 1 if MP-Session is Other Title"
label variable mp_missingtitle "= 1 if MP-Session is Missing Title"
label variable mp_sheikh_mod "= 1 if MP Initial Title is Sheikh"
label variable mp_effendi_mod "= 1 if MP Initial Title is Effendi"
label variable mp_bek_mod "= 1 if MP Initial Title is Bey"
label variable mp_pasha_mod "= 1 if MP Initial Title is Pasha"
label variable mp_othertitle_mod "= 1 if MP Initial Title is Other"
label variable mp_missingtitle_mod "= 1 if MP Title is All Missing"




*-------------------------------------------------------------------------------
*---------------I.4. Classification of MPs into Social Classes------------------
*-------------------------------------------------------------------------------


*Social Class Origin (Session-Invariant)
//(1) Is the MP a Top Bureaucrat, Pasha, or Bek (and not a Village Headman)? If Yes: Landed Elite. If No: go to (2).
//(2) Is the MP assigned to a constituency? If No: Missing Social Class. If Yes: go to (3).
//(3) Is the constituency urban or rural? If Urban: go to (4a). If Rural: go to (4b).
//(4a) Does the MP have a non-missing occupational title (Professional, Businessman, Religious Elite, Village Headman, Notable, Bureaucrat) or a non-missing honorific title (Effendi, Sheikh, Other)? If Yes: Urban Middle Class. If No: Missing Social Class.
//(4b) Same question as in (4a). If Yes: Rural Middle Class. If No: Missing Social Class.

*Landed Elite: 177 MPs (289 observations)
gen aristocracy = top_bureaucrat_mod == 1 | mp_pasha_mod == 1 | mp_bek_mod == 1
replace aristocracy = 0 if village_headman_mod == 1

*Rural Middle Class: 509 MPs (679 observations)
gen ruralbourgeoisie = ((urban == 0 & aristocracy == 0) & (bureaucrat_mod == 1 | ayan_mod == 1 | village_headman_mod == 1 | professional_mod == 1 | businessman_mod == 1 | religious_elite_mod == 1 | mp_effendi_mod == 1 | mp_sheikh_mod == 1 | mp_othertitle_mod == 1))

*Urban Middle Class: 34 MPs (57 observations)
gen urbanbourgeoisie = ((urban == 1 & aristocracy == 0) & (bureaucrat_mod == 1 | ayan_mod == 1 | village_headman_mod == 1 | professional_mod == 1 | businessman_mod == 1 | religious_elite_mod == 1 | mp_effendi_mod == 1 | mp_sheikh_mod == 1 | mp_othertitle_mod == 1))

*Missing Social Class: 51 MPs (77 observations)
gen classmissing = urbanbourgeoisie == 0 & aristocracy == 0 & ruralbourgeoisie == 0
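
*Optional consistency check (a sketch, not part of the original procedure): the four social class origin categories should partition the sample, so the dummies should sum to one for every observation. The tabulations can be compared with the counts reported in the comments above; the MP-level tabulation assumes that MP_first flags one observation per MP, as its use in I.3 suggests.
assert aristocracy + ruralbourgeoisie + urbanbourgeoisie + classmissing == 1
tab1 aristocracy ruralbourgeoisie urbanbourgeoisie classmissing
tab1 aristocracy ruralbourgeoisie urbanbourgeoisie classmissing if MP_first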


*Current Social Class (Session-Variant)

*Landed Elite: 369 observations
gen aristocracy1 = top_bureaucrat == 1 | mp_pasha == 1 | mp_bek == 1
replace aristocracy1 = 0 if village_headman == 1

*Rural Middle Class: 604 observations
gen ruralbourgeoisie1 = ((urban == 0 & aristocracy1 == 0) & (bureaucrat == 1 | ayan == 1 | village_headman == 1 | professional == 1 | businessman == 1 | religious_elite == 1 | mp_effendi == 1 | mp_sheikh == 1 | mp_othertitle == 1)) 

*Urban Middle Class: 47 observations
gen urbanbourgeoisie1 = ((urban == 1 & aristocracy1 == 0) & (bureaucrat == 1 | ayan == 1 | village_headman == 1 | professional == 1 | businessman == 1 | religious_elite == 1 | mp_effendi == 1 | mp_sheikh == 1 | mp_othertitle == 1)) 

*Missing Social Class: 82 observations
gen classmissing1 = urbanbourgeoisie1 == 0 & aristocracy1 == 0 & ruralbourgeoisie1 == 0


label variable aristocracy "MP Social Class Origin is Landed Elite"
label variable ruralbourgeoisie "MP Social Class Origin is Rural Middle Class"
label variable urbanbourgeoisie "MP Social Class Origin is Urban Middle Class"
label variable classmissing "MP Social Class Origin is Missing"
label variable aristocracy1 "MP-Session Social Class is Landed Elite"
label variable ruralbourgeoisie1 "MP-Session Social Class is Rural Middle Class"
label variable urbanbourgeoisie1 "MP-Session Social Class is Urban Middle Class"
label variable classmissing1 "MP-Session Social Class is Missing"


save "ID1_Egypt_MPs_1824_1923.dta", replace


*-------------------------------------------------------------------------------
*--------I.5. New Entrant/Incumbent, Elected/Appointed, Lower/Upper House-------
*-------------------------------------------------------------------------------

*Create a derivative dataset at the family name level that identifies the first session in which a family name appears in parliament
use "ID1_Egypt_MPs_1824_1923.dta", clear

drop if MPD_ID == . //87 MP-session observations with missing family name
bysort MPD_ID cycle: keep if _n == 1 //140 obs deleted
bysort MPD_ID: gen MPD_first = 1 if _n == 1 //There are 389 unique family names

*Generate dummy variable at the family name level that equals one in the first session a family appears in parliament
gen firsttimer_D = 0
bysort MPD_ID (cycle): replace firsttimer_D = 1 if _n==1 //389

keep MPD_ID cycle firsttimer_D
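
*Optional check (a sketch, not part of the original procedure): the m:1 merge below requires that MPD_ID and cycle jointly identify each observation in this temporary dataset. The report should show no surplus observations.
duplicates report MPD_ID cycle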

label variable MPD_ID "MP Family Name Unique Identifier"
label variable cycle "Parliamentary Session"
label variable firsttimer_D "=1 if MP Dynasty First Session"

save "TD1_firsttimer_D.dta", replace


*Generate variables tracing MP and dynastic persistence
use "ID1_Egypt_MPs_1824_1923.dta", clear

*Generate dummy variable that equals one in the first session of the MP
gen firsttimer = 0
bysort MP_UID (cycle): replace firsttimer = 1 if _n==1 //771

*Merge: Generate dummy variable that equals one for MPs who belong to a family that serves for the first time in a given parliamentary session (defined among MPs with non-missing family name)
merge m:1 MPD_ID cycle using TD1_firsttimer_D.dta, nogen keep(match master) //1015 matched
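
*Optional check (a sketch, not part of the original procedure): because the merge uses nogen, unmatched observations are not flagged by _merge. Under the assumption that only MP-sessions with a missing family name fail to match, the count below should be zero.
count if MPD_ID != . & firsttimer_D == .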

erase "TD1_firsttimer_D.dta"

*Count the number of families
bysort MPD_ID: gen MPD_first = 1 if _n == 1 //There are 389 unique family names
replace MPD_first = . if MPD_ID == . //1 change

*Generate dummy variable that equals 1 if incumbent at the MP and dynasty levels
gen incumbent = 1 - firsttimer
gen incumbent_D = 1 - firsttimer_D

*Generate dummy variable that equals 1 if appointed
gen Appointed = 1-Elected

*MP New Entrants, MP Incumbents, MP Dynasty New Entrants, MP Dynasty Incumbents, Elected, Appointed by Social Class Origin of MP
foreach x of varlist aristocracy ruralbourgeoisie urbanbourgeoisie{
	gen new`x' = `x'*firsttimer
	gen inc`x' = `x'*incumbent
	gen newD`x' = `x'*firsttimer_D
	gen incD`x' = `x'*incumbent_D
	gen elc`x' = `x'*Elected
	gen app`x' = `x'*Appointed
	gen UH`x' = `x'*house1
	gen LH`x' = `x'*house2
}
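
*Optional consistency check (a sketch, not part of the original procedure): since incumbent = 1 - firsttimer, the new entrant and incumbent dummies should add up to the social class dummy within each class.
foreach x of varlist aristocracy ruralbourgeoisie urbanbourgeoisie {
	assert new`x' + inc`x' == `x'
}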


*Generate dummy variable for the post-1882 sessions
gen post1882 = cycle >= 6

*Labeling variables
label variable firsttimer "=1 if MP New Entrant"
label variable firsttimer_D "=1 if MP Family New Entrant"
label variable incumbent "=1 if MP Served At Least Once Before"
label variable incumbent_D "=1 if MP Family Served At Least Once Before"
label variable Appointed "=1 if MP Appointed"
label variable newaristocracy "=1 if MP LE & New Entrant"
label variable newruralbourgeoisie "=1 if MP RMC & New Entrant"
label variable newurbanbourgeoisie "=1 if MP UMC & New Entrant"
label variable incaristocracy "=1 if MP LE & Incumbent"
label variable incruralbourgeoisie "=1 if MP RMC & Incumbent"
label variable incurbanbourgeoisie "=1 if MP UMC & Incumbent"
label variable newDaristocracy "=1 if MP LE & Family New Entrant"
label variable newDruralbourgeoisie "=1 if MP RMC & Family New Entrant"
label variable newDurbanbourgeoisie "=1 if MP UMC & Family New Entrant"
label variable incDaristocracy "=1 if MP LE & Family Incumbent"
label variable incDruralbourgeoisie "=1 if MP RMC & Family Incumbent"
label variable incDurbanbourgeoisie "=1 if MP UMC & Family Incumbent"
label variable elcaristocracy "=1 if MP LE & Elected"
label variable elcruralbourgeoisie "=1 if MP RMC & Elected"
label variable elcurbanbourgeoisie "=1 if MP UMC & Elected"
label variable apparistocracy "=1 if MP LE & Appointed"
label variable appruralbourgeoisie "=1 if MP RMC & Appointed"
label variable appurbanbourgeoisie "=1 if MP UMC & Appointed"
label variable UHaristocracy "=1 if MP LE & Upper House in 1883-1913"
label variable UHruralbourgeoisie "=1 if MP RMC & Upper House in 1883-1913"
label variable UHurbanbourgeoisie "=1 if MP UMC & Upper House in 1883-1913"
label variable LHaristocracy "=1 if MP LE & Lower House in 1883-1913"
label variable LHruralbourgeoisie "=1 if MP RMC & Lower House in 1883-1913"
label variable LHurbanbourgeoisie "=1 if MP UMC & Lower House in 1883-1913"
label variable post1882 "=1 if Session Post 1882"


save "ID1_Egypt_MPs_1824_1923.dta", replace



*------------------------------------------------------------------------------------------------------------------------------------------------------------------
*------------------------------------------------------------------------------------------------------------------------------------------------------------------
*II. MERGER OF INTERMEDIATE DATASET 1 (ID1_Egypt_MPs_1824_1923) WITH PROVINCE-LEVEL DATASET (RD2_Regressors_Province) TO PRODUCE WORKING DATASET 1 (WD1_Egypt_MPs_1824_1923)
*------------------------------------------------------------------------------------------------------------------------------------------------------------------
*------------------------------------------------------------------------------------------------------------------------------------------------------------------


use "ID1_Egypt_MPs_1824_1923.dta", clear
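
*Optional check (a sketch, not part of the original procedure): MP-sessions with a missing province code are not expected to match any province-level covariates in the merge below. The count reports how many such observations there are.
count if missing(provincecode_1882)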

merge m:1 provincecode_1882 using RD2_Regressors_Province, nogen

save "WD1_Egypt_MPs_1824_1923.dta", replace

erase "ID1_Egypt_MPs_1824_1923.dta"

